(54) File Formats, Methods, and Computer Program Products for Representing Workbooks



(57) File formats, methods, and computer program 
products are provided for representing a workbook in a 
modular content framework. The modular content frame- 
work may include a file format container associated with 
modular parts. A file format includes logically separate 
modular parts that are associated with each other by one 
or more relationships where each modular part is asso- 
ciated with a relationship type. The modular parts include 
a workbook part operative as a guide for properties of 
the workbook and a worksheet part associated with the
workbook part and operative to specify a definition of
cells within a worksheet associated with the worksheet 
part. The modular parts may also include a document 
properties part containing built-in properties associated 
with the file format and a thumbnail part containing as- 
sociated thumbnails. Each modular part is capable of be- 
ing interrogated separately, extracted from the work- 
book, and/or reused in a different workbook. 
Description

CROSS-REFERENCE TO RELATED APPLICATIONS 

[0001] This patent application is related to and filed 
with U.S. Patent Application, Attorney Docket No. 
60001.0441US01, entitled "File Formats, Methods, and
Computer Program Products For Representing Documents,"
filed on December 20, 2004; U.S. Patent Application,
Attorney Docket No. 60001.0443US01, entitled
"File Formats, Methods, and Computer Program Products
For Representing Presentations," filed on December
20, 2004; and Attorney Docket No. 60001.0440US01,
entitled "Management and Use of Data in a Computer- 
Generated Document," filed on December 20, 2004; all 
of which are assigned to the same assignee as this ap- 
plication. The aforementioned patent applications are ex- 
pressly incorporated herein, in their entirety, by refer- 
ence. 

TECHNICAL FIELD 

[0002] The present invention generally relates to file 
formats, and more particularly, is related to methods and 
file formats for representing features and data of work- 
books in a componentized spreadsheet application pro- 
gram. 

BACKGROUND 

[0003] The information age has facilitated an era of 
building informative spreadsheets utilizing spreadsheet 
software applications. However, the organization of fea- 
tures and data within previous spreadsheet file formats 
is very confusing and unclear to outside programmers 
and developers. For instance, previous spreadsheet file 
formats are created in the form of a single file using a 
binary record format containing all of the information
required to render workbooks. Because proprietary formats
are generally used to create these single files, writing 
code to work with and access these file formats without 
using the application program that created the file format 
is a nightmare for professional developers. 
[0004] Another problem is basic document or work- 
sheet re-use. For instance, it is very difficult to extract 
one or more worksheets from a workbook file and reuse 
the extracted worksheets in a different workbook and re- 
tain worksheet integrity, even in the same application. 
Comparatively, reusing worksheets between different 
applications is worse. Reusing content on a worksheet, 
for example reusing a table from EXCEL to WORD, is 
similarly difficult. 

[0005] Additionally, because of the single file format, 
it is practically impossible to lock part of a workbook. Most 
of the technology in terms of file locking is all done at the 
file level, thus if a file is locked by a user, no other users 
can edit the file. Viewing is possible, but not editing. 
[0006] There is also a problem of document interrogation.
Finding content within a workbook file, for example
finding worksheets for a 2004 sales forecast, can be a
daunting task. It is difficult to write code that programmatically
finds cell A1 of a spreadsheet file and determines
the contents of that cell (a string value, a formula,
a calculated result) without using the same spreadsheet
application that created the workbook. It is also very difficult
to find parts of a single file format presentation and
determine semantics about the content. For example, it
is difficult to write code that programmatically locates a
list of data in a spreadsheet application, and adds 3 rows
of data to the list without using that spreadsheet application.

[0007] It is still difficult to implement reader and writer
classes that can handle existing binary file formats well.
Even if a tool targeted at an application was developed,
it could not interrogate all document formats. This problem
is referred to as the opaqueness of single file formats.
[0008] Still further, due to intermingling of data,
re-branding a worksheet, or multiple worksheets,
is nearly impossible outside of the same spreadsheet
application. Re-branding a worksheet involves taking a
worksheet from workbook A, moving it to workbook B,
and making the worksheet look as though it was authored
in the normal authoring context of workbook B having the
same text-based format.

[0009] Document surfacing, the ability to take pieces 
of a worksheet document and drop them into another 
document of a different application, is also a problem. 
For instance, a spreadsheet table copied into a presentation
document is difficult to interrogate in the single file
format. 

[0010] Accordingly there is an unaddressed need in 
the industry to address the aforementioned deficiencies 
and inadequacies.

SUMMARY 

[0011] Embodiments of the present invention provide
file formats, methods, and computer program products
for representing a workbook in a modular content framework
implemented within a computing apparatus. Embodiments
of the present invention disclose a file format
based on open standards, such as an extensible markup
language (XML) file format and/or a binary file format,
and a method by which features and data of a workbook
are organized and modeled within a spreadsheet application
file. The file format is designed such that it is made
up of collections and parts. Each collection functions as
a folder and each modular part functions as a file. These
separate files are related together with relationships
where each separate file is associated with a relationship
type. This design greatly simplifies the way the spreadsheet
application organizes workbook features and data,
and presents a logical model that is much less confusing.
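
The collection, part, and relationship model described above can be pictured with a small data structure. The following is only a minimal sketch in Python: the class names `Part` and `Relationship`, the helper `collection_of`, and the part paths are all hypothetical and are not taken from the specification.

```python
from dataclasses import dataclass, field


@dataclass
class Relationship:
    """A typed link from one part to another (illustrative names only)."""
    rel_type: str   # relationship type, for example "worksheet"
    target: str     # path of the target part inside the container


@dataclass
class Part:
    """A modular part: behaves like a file holding its own content."""
    path: str                      # collection (folder) plus part name
    content: str = ""
    relationships: list = field(default_factory=list)


def collection_of(part: Part) -> str:
    """Return the collection (folder) a part lives in, or "" at the root."""
    folder, _, _ = part.path.rpartition("/")
    return folder


# Hypothetical paths: a workbook part at the root and one worksheet part
# stored in a "worksheets" collection.
workbook = Part("workbook.xml", "<workbook/>")
sheet1 = Part("worksheets/sheet1.xml", "<worksheet/>")
workbook.relationships.append(Relationship("worksheet", sheet1.path))
```

Keeping the relationship type on the link itself, rather than inferring it from the target's name, is what lets a reader ask for "all worksheet relationships" without opening any part.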

One embodiment is a file format for representing a workbook
in a modular content framework. The modular content
framework may include a file format container associated
with the modular parts. The file format includes
modular parts that are logically separate but associated 
with one another by one or more relationships. Each mod- 
ular part is associated with a relationship type and the 
modular parts include a workbook part operative as a 
guide for properties of the workbook. The modular parts 
also include a worksheet part associated with the work- 
book part and operative to specify a definition of cells 
within a worksheet, a sheet part containing data associ- 
ated with a macro sheet, a chart sheet part containing 
data associated with defining a chart, and/or 
a dialog sheet part containing data associated with work- 
book dialog. Each modular part is capable of being inter- 
rogated separately with or without the spreadsheet ap- 
plication and without other modular parts being interro- 
gated, which offers gains in efficiency when the workbook 
is queried. 

[0012] The modular parts may also include a document
properties part containing built-in properties associated 
with the file format and a thumbnail part containing one 
or more thumbnails associated with the file format. Each 
modular part is capable of being extracted from and/or 
copied from the workbook and reused in a different work- 
book along with associated modular parts identified by 
traversing the relationships of the modular part reused. 
[0013] Another embodiment is a method for representing
a workbook in a file format wherein modular parts
associated with the workbook include each part written 
into the file format. The method involves writing a work- 
book part of the file format, querying the workbook for a 
worksheet relationship type, and writing a worksheet part 
of the file format separate from the workbook part. The 
method also involves establishing a relationship between 
the worksheet part and the workbook part. Additionally, 
the method may involve establishing a relationship be- 
tween the workbook part and a file format container 
where the file format container includes a document properties
part containing built-in properties associated with
the file format and a thumbnail part containing a thumb- 
nail associated with the file format. 
[0014] Still further, the method may involve writing 
modular parts associated with relationship types wherein 
the modular parts that are to be shared are written only 
once, and establishing relationships to the modular parts 
written. Writing the modular parts may also involve ex- 
amining data associated with the workbook, determining 
whether the data examined has been written to a modular
part, and when the data examined has not been written 
to the modular part, writing the modular part to include 
the data examined. 

[0015] Still another embodiment is a computer pro- 
gram product including a computer-readable medium 
having control logic stored therein for causing a computer
to represent a workbook in a file format where modular 
parts of the file format include each part written into the 
file format. The control logic includes computer-readable 
program code for causing the computer to write a workbook
part of the file format, query the workbook for a
worksheet relationship type, write a worksheet part of the 
file format separate from the workbook part, and establish 
a relationship between the worksheet part and the workbook
part.

[0016] The invention may be implemented utilizing a
computer process, a computing system, or as an article
of manufacture such as a computer program product or
computer readable media. The computer program product
may be a computer storage medium readable by a computer
system and encoding a computer program of instructions
for executing a computer process. The computer
program product may also be a propagated signal
on a carrier readable by a computing system and encoding
a computer program of instructions for executing a
computer process.

[0017] These and various other features, as well as 
advantages, which characterize the present invention, 
will be apparent from a reading of the following detailed 
description and a review of the associated drawings.

BRIEF DESCRIPTION OF THE DRAWINGS 

[0018] 

FIGURE 1 is a computing system architecture illustrating
a computing apparatus utilized in and provided
by various illustrative embodiments of the invention;

FIGURES 2a-2c are block diagrams illustrating a
workbook relationship hierarchy for various modular
parts utilized in a file format for representing a workbook
according to various illustrative embodiments
of the invention; and

FIGURES 3-4 are illustrative routines performed in
representing workbooks in a modular content framework
according to illustrative embodiments of the invention.



[0019] Referring now to the drawings, in which like numerals
represent like elements, various aspects of the
present invention will be described. In particular, FIGURE
1 and the corresponding discussion are intended to provide
a brief, general description of a suitable computing
environment in which embodiments of the invention may
be implemented. While the invention will be described in
the general context of program modules that execute in
conjunction with program modules that run on an operating
system on a personal computer, those skilled in the
art will recognize that the invention may also be implemented
in combination with other types of computer systems
and program modules.

[0020] Generally, program modules include routines,
programs, operations, components, data structures, and 
other types of structures that perform particular tasks or 
implement particular abstract data types. Moreover, 
those skilled in the art will appreciate that the invention
may be practiced with other computer system configura- 
tions, including hand-held devices, multiprocessor sys- 
tems, microprocessor-based or programmable consum- 
er electronics, minicomputers, mainframe computers, 
and the like. The invention may also be practiced in dis- 
tributed computing environments where tasks are per- 
formed by remote processing devices that are linked 
through a communications network. In a distributed com- 
puting environment, program modules may be located in 
both local and remote memory storage devices. 
[0021] Referring now to FIGURE 1, an illustrative computer
architecture for a computer 2 utilized in an embodiment
of the invention will be described. The computer
architecture shown in FIGURE 1 illustrates a computing 
apparatus, such as a server, desktop, laptop, or handheld
computing apparatus, including a central processing unit
5 ("CPU"), a system memory 7, including a random access
memory 9 ("RAM") and a read-only memory
("ROM") 11, and a system bus 12 that couples the memory
to the CPU 5. A basic input/output system containing
the basic routines that help to transfer information be- 
tween elements within the computer, such as during star- 
tup, is stored in the ROM 11. The computer 2 further 
includes a mass storage device 14 for storing an oper- 
ating system 16, application programs, and other pro- 
gram modules, which will be described in greater detail 
below. 

[0022] The mass storage device 14 is connected to 
the CPU 5 through a mass storage controller (not shown) 
connected to the bus 12. The mass storage device 14 
and its associated computer-readable media provide 
non-volatile storage for the computer 2. Although the de- 
scription of computer-readable media contained herein 
refers to a mass storage device, such as a hard disk or 
CD-ROM drive, it should be appreciated by those skilled 
in the art that computer-readable media can be any avail- 
able media that can be accessed by the computer 2. 
[0023] By way of example, and not limitation, compu- 
ter-readable media may comprise computer storage me- 
dia and communication media. Computer storage media 
includes volatile and non-volatile, removable and non- 
removable media implemented in any method or tech- 
nology for storage of information such as computer-read- 
able instructions, data structures, program modules or 
other data. Computer storage media includes, but is not 
limited to, RAM, ROM, EPROM, EEPROM, flash memory
or other solid state memory technology, CD-ROM, digital
versatile disks ("DVDs"), or other optical storage, magnetic
cassettes, magnetic tape, magnetic disk storage or
other magnetic storage devices, or any other medium 
which can be used to store the desired information and 
which can be accessed by the computer 2. 
[0024] According to various embodiments of the inven- 
tion, the computer 2 may operate in a networked envi- 
ronment using logical connections to remote computers 
through a network 18, such as the Internet. The computer
2 may connect to the network 18 through a network interface
unit 20 connected to the bus 12. It should be appreciated
that the network interface unit 20 may also be
utilized to connect to other types of networks and remote
computer systems. The computer 2 may also include an
input/output controller 22 for receiving and processing
input from a number of other devices, including a keyboard,
mouse, or electronic stylus (not shown in FIGURE
1). Similarly, an input/output controller 22 may provide
output to a display screen, a printer, or other type of output
device.

[0025] As mentioned briefly above, a number of pro- 
gram modules and data files may be stored in the mass 
storage device 14 and RAM 9 of the computer 2, including
an operating system 16 suitable for controlling the operation
of a networked personal computer, such as the
WINDOWS XP operating system from MICROSOFT
CORPORATION of Redmond, Washington. The mass
storage device 14 and RAM 9 may also store one or more
program modules. In particular, the mass storage device
14 and the RAM 9 may store a spreadsheet application
program 10. The spreadsheet application program 10 is
operative to provide functionality for the creation and
structure of workbooks, such as a workbook 27, in an
open file format 24, such as an XML file format. According
to one embodiment of the invention, the spreadsheet application
program 10 and other application programs 26
comprise the OFFICE suite of application programs from 
MICROSOFT CORPORATION including the WORD, 
EXCEL, and POWERPOINT application programs. 

[0026] Embodiments of the present invention greatly
simplify and clarify the organization of workbook features
and data. The spreadsheet program 10 organizes the
'parts' of a workbook file (features, data, themes, styles,
objects, etc.) into logical, separate pieces, and then expresses
relationships among the separate parts. These
relationships, and the logical separation of 'parts' of a 
workbook, make up a new file organization that can be 
easily accessed, such as by a developer's code, without 
using the spreadsheet application itself. 

[0027] Referring now to FIGURES 2a-2c, block diagrams
illustrating a workbook relationship hierarchy for
various modular parts utilized in the file format 24 for
representing a workbook according to various illustrative
embodiments of the invention will be described. The
workbook relationship hierarchy 208 lists specific
spreadsheet application relationships. Optional relationships
with respect to validation are indicated in italics,
and dashed connecting lines 203 indicate a one to potentially
many relationship. Thus, for example, there is a
worksheet part 217 for each worksheet associated with
a workbook 202.

[0028] The various modular parts or components of
the workbook hierarchy 208 are logically separate but
are associated by one or more relationships. Each modular
part is also associated with a relationship type and
is capable of being interrogated separately with or without
the spreadsheet application program 10 and/or with or
without other modular parts being interrogated. Thus, for
example, it is easier to locate the contents of a worksheet
cell because instead of searching through all the binary 
records for cell information, code can be written to easily 
inspect the relationships in a workbook and find the work- 
sheet parts, effectively ignoring the other features and 
data in the file format 24. Thus, the code is written to step 
through the cells in a much simpler fashion than previous 
interrogation code. Further, 'authoring' scenarios, where 
a developer writes code to insert a new part, or to insert 
a completely new file without running the spreadsheet 
application, are simplified due to the modular part file 
format. 
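
The interrogation scenario in this paragraph (inspect the relationships, find the worksheet parts, ignore everything else, then step through the cells) might look roughly like the sketch below. It is a hedged illustration only: the in-memory `PARTS` and `RELATIONSHIPS` tables, the part paths, and the XML element and attribute names (`c`, `r`) are assumptions made for the example and are not defined by the specification.

```python
import xml.etree.ElementTree as ET

# Hypothetical container contents: part path -> XML text.
# Element and attribute names are invented for illustration.
PARTS = {
    "workbook.xml": "<workbook/>",
    "worksheets/sheet1.xml": (
        '<worksheet><c r="A1">Forecast</c><c r="B1">42</c></worksheet>'
    ),
    "styles.xml": "<styles/>",
}

# Relationships as (source part, relationship type, target part).
RELATIONSHIPS = [
    ("workbook.xml", "worksheet", "worksheets/sheet1.xml"),
    ("workbook.xml", "styles", "styles.xml"),
]


def worksheet_parts(source="workbook.xml"):
    """Find worksheet parts by relationship type, ignoring other parts."""
    return [
        target
        for src, rel_type, target in RELATIONSHIPS
        if src == source and rel_type == "worksheet"
    ]


def read_cell(part_path, ref):
    """Return the text of the cell with reference `ref`, or None."""
    root = ET.fromstring(PARTS[part_path])
    for cell in root.iter("c"):
        if cell.get("r") == ref:
            return cell.text
    return None
```

With these assumptions, `read_cell(worksheet_parts()[0], "A1")` touches only the worksheet part; the styles part is never parsed.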

[0029] A modular content framework may include a file
format container 204 associated with the modular parts.
The modular parts include the workbook part 202 operative
as a guide for properties of the workbook and the
worksheet part 217 associated with the workbook part
202 and operative to specify a definition of cells within a
worksheet associated with the worksheet part 217. The
workbook hierarchy 208 may also include a document 
properties part 205 containing built-in properties associ- 
ated with the file format 24, and a thumbnail part 207 
containing a thumbnail associated with the file format 24. 
[0030] The modular parts also include a sheet part 210
containing data associated with a macro sheet, a chart 
sheet part 212 containing data associated with defining 
a chart, and a style sheet part 240 representing a theme 
in the workbook. It should be appreciated that each mod- 
ular part is capable of being extracted from or copied 
from the workbook and reused in a different workbook 
along with associated modular parts identified by travers- 
ing relationships of the modular part reused. Associated 
modular parts are identified when the spreadsheet ap- 
plication 10 traverses inbound and outbound relation- 
ships of the modular part reused. 
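
Identifying the associated modular parts by traversing relationships can be sketched as a graph walk. The example below is an assumption-laden illustration in Python: it follows outbound relationships transitively to collect what a reused part depends on, and lists direct inbound relationships separately. The specification does not state how far inbound relationships are followed, so that split is a design choice of this sketch, and all part names are hypothetical.

```python
from collections import deque

# (source part, relationship type, target part); all names hypothetical.
RELS = [
    ("workbook.xml", "worksheet", "worksheets/sheet1.xml"),
    ("workbook.xml", "worksheet", "worksheets/sheet2.xml"),
    ("worksheets/sheet1.xml", "image", "images/logo.png"),
    ("worksheets/sheet1.xml", "comments", "comments1.xml"),
    ("worksheets/sheet2.xml", "image", "images/logo.png"),
]


def outbound_closure(start, rels):
    """Parts reachable from `start` by following outbound relationships."""
    seen, queue = [start], deque([start])
    while queue:
        current = queue.popleft()
        for source, _rel_type, target in rels:
            if source == current and target not in seen:
                seen.append(target)
                queue.append(target)
    return seen


def inbound_sources(part, rels):
    """Parts that hold a relationship pointing at `part`."""
    return sorted({source for source, _rel_type, target in rels if target == part})
```

Under these assumptions, copying `worksheets/sheet1.xml` into another workbook would carry its image and comments parts along, while the inbound list shows which parts in the original workbook still refer to a shared part such as the logo.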
[0031] Other modular parts may include a style part 
220 containing data associated with a style at a cell level 
in the workbook, a dialog sheet part 214 containing data
associated with workbook dialog, a markup maps part 
218 containing visuals depicting a markup language for- 
mat associated with the workbook, and a shared strings 
part 222 containing a string associated with a plurality of 
cells in the workbook. Still other modular parts include a 
workbook connections part 224 containing data associ- 
ated with interfacing with the workbook, a background 
picture part 225, a mail envelope part 242 containing en- 
velope data where a user of the workbook has sent the 
workbook via electronic mail, a code file part 244 con- 
taining code associated with the workbook, and a com- 
ments part 247 containing comments associated with the 
workbook. 

[0032] Still further, the modular parts may include a 
schemas part 254 containing schemas associated with 
the markup maps part 218, an image part 248 containing
image data associated with the workbook, and an em- 
bedded object part 230 containing an object associated 
with the workbook. Other modular parts may also include 
a user data part 245 containing customized data capable
of being read into the workbook and changed, a drawing
object part 257 containing an object built using a drawing
platform, a legacy drawing object part 252, such as an
Escher 1.0 object, a table index part 232 containing data
defining a table index associated with the worksheet, and
a list part 228 containing data defining a list associated
with the worksheet. As an example, embodiments of the
present invention make it easier to locate a list in a workbook
because any list has a list part 228 separate in the
file format 24 with corresponding relationships expressed.
The list part 228, as are other modular parts, is
logically broken out and separate from other features and
data of the workbook. Further, because the logical structure
of a list is clearly understood, it is also less complicated
to add more rows of data to a list.
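
As a rough illustration of adding rows to a list once the list lives in its own part, the sketch below appends three rows to a list part held as XML. The markup (`list`, `row`, `v`) and the function name `append_rows` are hypothetical; the specification does not define the list part's schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical list part content; the element names are assumptions.
LIST_PART = "<list name='Sales'><row><v>Q1</v><v>100</v></row></list>"


def append_rows(list_xml, rows):
    """Return the list part XML with one <row> appended per input row."""
    root = ET.fromstring(list_xml)
    for values in rows:
        row = ET.SubElement(root, "row")
        for value in values:
            ET.SubElement(row, "v").text = str(value)
    return ET.tostring(root, encoding="unicode")


updated = append_rows(LIST_PART, [("Q2", 120), ("Q3", 90), ("Q4", 150)])
```

Only the list part is read and rewritten here; no other part of the workbook needs to be opened, which is the point the paragraph above makes.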

[0033] Other modular parts associated with the work- 
book may include a pivot table part 234 containing data 
defining a pivot table associated with the worksheet, a 
pivot cache definition part 235 containing data defining 
a cache associated with the pivot table, and a pivot cache
records part 237 containing data associated with the pivot
cache definition part. A pivot table is a program tool that
allows selected columns and rows of data in a spreadsheet
or database table to be reorganized and summarized
in order to obtain a desired report. A pivot table
turns the data to view it from different perspectives. It
should be appreciated that modular parts that are shared
in more than one relationship are typically only written to
the file once. It should also be appreciated that certain
modular parts are global and thus can be used anywhere
in the file format. In contrast, some modular parts are 
non-global and thus, can only be shared on a limited ba- 
sis. 

[0034] In various embodiments of the invention, the 
file format 24 may be formatted according to extensible
markup language ("XML") and/or a binary format. As is
understood by those skilled in the art, XML is a standard
format for communicating data. In the XML data format,
a schema is used to provide XML data with a set of grammatical
and datatype rules governing the types and structure
of data that may be communicated. The XML data
format is well-known to those skilled in the art, and therefore
not discussed in further detail herein. The XML formatting
closely reflects the internal memory structure of
an entire workbook. Thus, an increase in load and save
speed is evident.

[0035] Embodiments of the present invention make
workbooks more programmatically accessible. This enables
a significant number of new uses that are simply
too hard for previous file formats to accomplish. For instance,
utilizing embodiments of the present invention,
a server-side program is able to create a workbook for
someone based on their input, for example, creating an
analysis report on Company A for the time period of
1/1/2004-12/31/2004, where all variable input is italicized.
[0036] Other examples include an external process
scanning and rewriting all workbooks on a network in
order to update a company logo and visual color scheme,
a chart in one workbook being programmatically copied
and inserted into another workbook, and calculation results
being retrieved from a workbook and updated to a
database nightly. Still another example is that a government
agency can process workbooks and more easily convert
the features and data to their internal text-based format.
[0037] FIGURES 2a-2c also include relationship types 
utilized in the file format 24 according to various illustra- 
tive embodiments of the invention. The relationship types 
associated with the modular parts not only identify an 
association or dependency but also identify the basis of 
the dependency. The relationship types include the following:
a code file relationship capable of identifying code
files, a user data relationship, a style sheet relationship,
a comments relationship, an embedded object relationship,
a drawing object relationship, an image relationship,
a sound relationship, a mail envelope relationship, a document
properties relationship, a thumbnail relationship,
a schema relationship, a chart sheet relationship, a dialog
sheet relationship, a worksheet relationship, a pivot
table relationship, a shared string relationship, a lists relationship,
a pivot cache relationship, a styles relationship,
a markup maps relationship, and a pivot cache
metadata relationship.

[0038] FIGURE 2c also illustrates a listing
259 that lists collection types for organizing the modular
parts. The collection types include a chart sheet collection
including the chart sheet part 212, a dialog sheet
collection including the dialog sheet part 214, a worksheets
collection including the worksheet part 217, a pivots
collection including the pivot table part 234, and a
pivot cache collection including the pivot cache definition
part 235 and the pivot cache records part 237. The collection
types also include a styles collection including the
style sheet part 240 and the style part 220, a markup
maps collection including the markup maps part 218, a
lists collection including the list part 228, and an embeddings
collection including the embedded object part
230 and the user data part 245.
[0039] FIGURES 3-4 are illustrative routines per- 
formed in representing workbooks in a modular content 
framework according to illustrative embodiments of the 
invention. When reading the discussion of the routines 
presented herein, it should be appreciated that the logical
operations of various embodiments of the present inven- 
tion are implemented (1) as a sequence of computer im- 
plemented acts or program modules running on a com- 
puting system and/or (2) as interconnected machine logic 
circuits or circuit modules within the computing system. 
The implementation is a matter of choice dependent on 
the performance requirements of the computing system 
implementing the invention. Accordingly, the logical op- 
erations illustrated in FIGURES 3-4, and making up the 
embodiments of the present invention described herein 
are referred to variously as operations, structural devic- 
es, acts or modules. It will be recognized by one skilled 
in the art that these operations, structural devices, acts 
and modules may be implemented in software, in
firmware, in special purpose digital logic, and any combination
thereof without deviating from the spirit and
scope of the present invention as recited within the claims 
set forth herein. 

[0040] Referring now to FIGURES 2a-2c and 3, the
routine 300 begins at operation 304, where the spreadsheet
application program 10 writes the workbook part
202. The routine 300 continues from operation 304 to
operation 305, where the spreadsheet application program
10 queries the workbook for worksheet relationships.
Next, at operation 307, the spreadsheet application
10 writes the worksheet parts 217 referenced in the
workbook part 202 and establishes relationships between
each worksheet part 217 and the workbook part
202.

[0041] Next, at operation 308, the spreadsheet application
10 writes other modular parts associated with relationship
types, such as the image part 248 and the
schemas part 254. Any modular part to be shared between
other modular parts is written only once. The routine 300
then continues to operation 310.
[0042] At operation 310, the spreadsheet application
10 establishes relationships between newly written and
previously written modular parts. The routine 300 then
terminates at return operation 312.
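
The sequence of operations 304 through 312 can be summarized in code. What follows is a minimal sketch under stated assumptions, not the application's actual implementation: the input dictionary shape (`properties`, `worksheets`, `shared`, and a per-sheet `uses` list), the part paths, and the function name `write_workbook` are all hypothetical.

```python
def write_workbook(workbook):
    """Write parts and relationships for a workbook (sketch of routine 300).

    `workbook` is a hypothetical dict with keys "properties", "worksheets"
    and "shared"; this shape is an assumption, not part of the source.
    """
    parts, relationships = {}, []

    # Operation 304: write the workbook part.
    parts["workbook"] = workbook["properties"]

    # Operation 305: query the workbook for worksheet relationships.
    sheets = workbook["worksheets"]

    # Operation 307: write each worksheet part, separate from the workbook
    # part, and relate it to the workbook part.
    for name, sheet in sheets.items():
        path = "worksheets/" + name
        parts[path] = sheet["cells"]
        relationships.append(("workbook", "worksheet", path))

    # Operation 308: write the other modular parts; a shared part is
    # written only once, however many worksheets use it.
    for sheet in sheets.values():
        for _rel_type, target in sheet["uses"]:
            if target not in parts:
                parts[target] = workbook["shared"][target]

    # Operation 310: relate newly written and previously written parts.
    for name, sheet in sheets.items():
        for rel_type, target in sheet["uses"]:
            relationships.append(("worksheets/" + name, rel_type, target))

    # Operation 312: return.
    return parts, relationships


example = {
    "properties": {"title": "Forecast"},
    "worksheets": {
        "sheet1": {"cells": {"A1": "Sales"}, "uses": [("image", "images/logo")]},
        "sheet2": {"cells": {"A1": "Costs"}, "uses": [("image", "images/logo")]},
    },
    "shared": {"images/logo": b"logo-bytes"},
}
parts, relationships = write_workbook(example)
```

In this example both worksheets use the same image, so the image part is stored once while each worksheet gets its own relationship to it.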

[0043] Referring now to FIGURE 4, the routine 400 for
writing modular parts will be described. The routine 400
begins at operation 402 where the spreadsheet application
10 examines data in the spreadsheet application.
The routine 400 then continues to detect operation 404
where a determination is made as to whether the data
has been written to a modular part. When the data has
not been written to a modular part, the routine 400 continues
from detect operation 404 to operation 405 where
the spreadsheet application 10 writes a modular part including
the data examined. The routine 400 then continues
to detect operation 407 described below.
[0044] When, at detect operation 404, the data examined
has already been written to a modular part, the routine
400 continues from detect operation 404 to detect operation
407. At detect operation 407 a determination is
made as to whether all the data has been examined. If
all the data has been examined, the routine 400 returns
control to other operations at return operation 412. When
there is still more data to examine, the routine 400 continues
from detect operation 407 to operation 410 where
the spreadsheet application 10 points to other data. The
routine 400 then returns to operation 402 described
above.
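
The loop of routine 400 amounts to a write-once pass over the workbook data. The sketch below is a hedged Python rendering: representing the data as `(part_name, data)` pairs and keying the "already written" check on the part name are assumptions of this example, and the function name `write_modular_parts` is hypothetical.

```python
def write_modular_parts(data_items):
    """Write each piece of data to a modular part at most once.

    `data_items` is an iterable of (part_name, data) pairs; this shape is
    an assumption made for the sketch.
    """
    written = {}
    # Operation 402 examines an item; operation 410 advances to the next.
    for part_name, data in data_items:
        # Detect operation 404: has this data already been written?
        if part_name in written:
            continue
        # Operation 405: write a modular part including the data examined.
        written[part_name] = data
    # Detect operation 407 finds no more data; return operation 412.
    return written


result = write_modular_parts([
    ("shared_strings", ["Sales", "Costs"]),
    ("styles", {"bold": True}),
    ("shared_strings", ["Sales", "Costs"]),  # met again, not rewritten
])
```

The second occurrence of the shared strings data is skipped, which matches the statement that a part shared in more than one relationship is written to the file only once.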

[0045] Based on the foregoing, it should be appreciated
that the various embodiments of the invention include
file formats, methods and computer program products
for representing workbooks in a modular content framework.
The above specification, examples and data provide
a complete description of the manufacture and use
of the composition of the invention. Since many embodiments
of the invention can be made without departing
from the spirit and scope of the invention, the invention
resides in the claims hereinafter appended.



1. A file format for representing a workbook within a
spreadsheet application, the file format representing 
the workbook in a modular content framework im- 
plemented within a computing apparatus, the file for- 
mat comprising: 



modular parts logically separate but associated 
by one or more relationships wherein each mod- 
ular part is associated with a relationship type 
and wherein the modular parts include: 

a workbook part operative as a guide for 
properties of the workbook; and at least one 
of the following: 

a worksheet part associated with the work- 
book part and operative to specify a defini- 
tion of cells within a worksheet associated 
with the worksheet part; 
a sheet part containing data associated with 
a macro sheet; 

a chart sheet part containing data associat- 
ed with defining a chart; and 
a dialog sheet part containing data associ- 
ated with workbook dialog; 

wherein each modular part is capable of being interrogated
separately without the spreadsheet application and without
other modular parts being interrogated.

2. The file format of claim 1, wherein the modular content
framework includes a file format container associated
with the modular parts wherein the modular parts
further include:

a document properties part containing built-in 
properties associated with the file format; and 
a thumbnail part containing thumbnails associ- 
ated with the file format. 

3. The file format of claim 1, wherein each modular part
is capable of being one of extracted from and copied 
from the workbook and reused in a different work- 
book along with associated modular parts identified 
by traversing relationships of the modular part re- 
used. 

4. The file format of claim 3, wherein the modular parts 
further include at least one of the following: 

a style sheet part representing a theme in the 
workbook; 

a style part containing data associated with a
style at a cell level in the workbook;
a comments part containing comments associ- 
ated with the workbook; 
a markup maps part containing visuals depicting 
a markup language format associated with the 
workbook; 

a schemas part containing schemas associated 
with the markup maps part; 
a shared strings part containing a string associ- 
ated with a plurality of cells in the workbook; 
a workbook connections part containing data as- 
sociated with interfacing with the workbook; 
a mail envelope part containing envelope data 
where a user of the workbook has sent the work- 
book via electronic mail; and 
a code file part containing code associated with 
the workbook. 

5. The file format of claim 4, wherein the modular parts
further include at least one of the following: 

an image part containing image data associated 
with the workbook; 

an embedded object part containing an object 
associated with the workbook; 
a user data part containing customized data ca- 
pable of being read into the workbook and 
changed; and 

a drawing object part containing an object built 
using a drawing platform. 

6. The file format of claim 5, wherein the modular parts
further include at least one of the following: 

a table index part containing data defining a ta- 
ble index associated with the worksheet; 
a list part containing data defining a list associ- 
ated with the worksheet; 
a pivot table part containing data defining a pivot 
table associated with the worksheet; 
a pivot cache definition part containing data de- 
fining a cache associated with the pivot table; 
and 

a pivot cache records part containing data associated
with the pivot cache definition part.



7. The file format of claim 6, wherein at least some of
the modular parts are organized in collection types 
and wherein the collection types include at least one 
of the following: 

a chart sheet collection including the chart sheet 
part; 

a dialog sheet collection including the dialog 
sheet part; 

a worksheets collection including the worksheet 
part; 

a pivots collection including the pivot table part; 




a pivot cache collection including at least one of 
the pivot cache definition part and the pivot 
cache records part; 

a styles collection including at least one of the 
styles sheet part and the styles part;
a markup maps collection including the markup 
maps part; 

a lists collection including the lists part; and 
an embeddings collection including at least one of
the embedded object part and the user data part.

8. The file format of claim 3, where the relationship
types associated with the modular parts comprise
at least one of a code file relationship capable of
identifying code files, a user data relationship, a
hyperlink relationship, a stylesheet relationship, a
comments relationship, an embedded object relationship,
a drawing object relationship, an image relationship,
a mail envelope relationship, a document properties
relationship, a thumbnail relationship, a schema
relationship, a chart sheet relationship, a dialog sheet
relationship, a worksheet relationship, a pivot table
relationship, a shared string relationship, a lists
relationship, a pivot cache records relationship, a styles
relationship, a markup maps relationship, and a pivot
cache metadata relationship.

9. The file format of claim 3, wherein content of the
worksheet is capable of being one of extracted from
and copied from the workbook and reused in a different
workbook.

10. The file format of claim 3, wherein each modular part
is capable of being locked separately while the other
modular parts remain available for locking whereby
multiple editors may each concurrently edit a modular
part of the file format.

11. The file format of claim 3, wherein the modular parts
are capable of providing semantics about content 
within the workbook when a modular part is interro- 
gated. 

12. The file format of claim 3, wherein at least one of the
modular parts is authored in an authoring context of
the workbook and wherein the at least one modular
part in the workbook is capable of being one of extracted
from and copied from the workbook and moved to a different
workbook and wherein the at least one modular part is
further capable of being altered to appear as though the
at least one modular part was authored in an authoring
context of the different workbook.

13. The file format of claim 3, wherein the file format is
capable of providing a high-resolution thumbnail preview
of each worksheet in the workbook.

14. The file format of claim 1, wherein the file format is
formatted according to at least one of a markup lan- 
guage format and a binary format. 

15. A method for representing a workbook in a file format
wherein modular parts associated with the workbook 
include each part written into the file format, the 
method comprising: 

writing a workbook part of the file format; 
querying the workbook part for a worksheet re- 
lationship type; 

writing a worksheet part of the file format sepa- 
rate from the workbook part; and 
establishing a relationship between the work- 
sheet part and the workbook part. 

16. The method of claim 15, further comprising estab- 
lishing a relationship between the workbook part and 
a file format container wherein the file format con- 
tainer includes at least one of the following: 

a document properties part containing built-in
properties associated with the file format; and 
a thumbnail part containing a thumbnail associ- 
ated with the file format. 

17. The method of claim 15, further comprising: 

writing other modular parts associated with re- 
lationship types wherein the other modular parts 
that are to be shared are written only once; and 
establishing relationships to the other modular 
parts written. 

18. The method of claim 17, wherein writing the other 
modular parts associated with the relationship types 
comprises: 

a) examining data associated with the work- 
book; 

b) determining whether the data examined has 
been written to a modular part; 

c) when the data examined has not been written 
to the modular part, writing the modular part to 
include the data examined, examining other da- 
ta associated with the workbook, and repeating 
b) through d); and 

d) when the data examined has been written to
the modular part, examining other data associated
with the workbook and repeating b) through d).

19. A computer program product comprising a computer-readable
medium having control logic stored therein for causing a
computer to represent a workbook in a file format comprising
modular parts wherein the modular parts of the file format
include each part written into the file format, the control
logic comprising computer-readable program code for causing
the computer to:

write a workbook part of the file format;
query the workbook part for a worksheet rela- 
tionship type; 

write a worksheet part of the file format separate 
from the workbook part; and 
establish a relationship between the worksheet
part and the workbook part.

20. The computer program product of claim 19, wherein
the file format comprises at least one of a markup
file format written in a markup language and a binary
format.
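
By way of illustration only, the traversal of relationships recited in claim 3 (identifying the associated modular parts that accompany a reused part) may be sketched as a breadth-first walk over typed relationships. The function name `collect_for_reuse`, the tuple layout `(source, type, target)`, and the part names are hypothetical and are not taken from the claims.

```python
from collections import deque


def collect_for_reuse(start, relationships):
    """Return `start` plus every part reachable by following relationships.

    relationships: list of (source, rel_type, target) tuples (assumed shape)
    """
    # Index outgoing relationships by source part.
    outgoing = {}
    for source, _rel_type, target in relationships:
        outgoing.setdefault(source, []).append(target)

    seen = [start]
    queue = deque([start])
    while queue:
        for target in outgoing.get(queue.popleft(), []):
            if target not in seen:  # visit each part only once
                seen.append(target)
                queue.append(target)
    return seen


rels = [
    ("workbook", "worksheet", "sheet1"),
    ("workbook", "worksheet", "sheet2"),
    ("sheet1", "image", "image1"),
    ("sheet1", "pivotTable", "pivot1"),
    ("pivot1", "pivotCacheDefinition", "cache1"),
]
needed = collect_for_reuse("sheet1", rels)
```

In this sketch, reusing the hypothetical part `sheet1` brings along its image, pivot table and pivot cache definition, while `sheet2` and the workbook part itself are left behind.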
Fig. 3 (routine 300, REPRESENT WORKBOOK): WRITE WORKBOOK PART (304); [...] RELATIONSHIPS; WRITE WORKSHEET PARTS AND ESTABLISH RELATIONSHIPS; WRITE MODULAR PARTS ASSOCIATED WITH RELATIONSHIP TYPES AND WRITE SHARED MODULAR PARTS ONLY ONCE (SEE ROUTINE 400 FIG. 4); ESTABLISH RELATIONSHIPS BETWEEN NEWLY WRITTEN AND PREVIOUSLY WRITTEN MODULAR PARTS; return (312).